Robust mixture clustering using Pearson type VII distribution

Authors

  • Jianyong Sun
  • Ata Kabán
  • Jonathan M. Garibaldi
Abstract

A mixture of Student t-distributions (MoT) has been widely used to model multivariate data sets with atypical observations, or outliers, for robust clustering. In this paper, we developed a novel robust clustering approach by modeling the data sets using a mixture of Pearson type VII distributions (MoP). An EM algorithm is developed for the maximum likelihood estimation of the model parameters. An outlier detection criterion is derived from the EM solution. Controlled experimental results on synthetic datasets show that the MoP is more viable than the MoT. On average, the MoP performs comparably to, if not better than, the MoT in terms of outlier detection accuracy and out-of-sample log-likelihood. Furthermore, we compared the performance of the Pearson type VII and the Student t mixtures on the classification of several real pattern recognition data sets. The comparison favours the developed Pearson type VII mixtures. © 2010 Elsevier B.V. All rights reserved.

Similar resources

Robust Mixture Modeling with Pearson Type VII Distribution

Mixture of Student t-distribution (MoT) has been widely used to model multivariate data sets with atypical observations, or outliers for robust clustering. In this paper, we developed a novel robust clustering approach by modeling the data sets with mixture of Pearson type VII distribution (MoP). An EM algorithm is developed for the maximum likelihood estimation of the model parameters. Outlier...

A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering

We propose a family of multivariate heavy-tailed distributions that allow variable marginal amounts of tailweight. The originality comes from introducing multidimensional instead of univariate scale variables for the mixture of scaled Gaussian family of distributions. In contrast to most existing approaches, the derived distributions can account for a variety of shapes and have a simple tractab...

Generalized Birnbaum-Saunders Distribution

The two-parameter Birnbaum–Saunders (BS) distribution was originally proposed as a failure time distribution for fatigue failure caused under cyclic loading. BS model is a positively skewed statistical distribution which has received great attention in recent decades. Several extensions of this distribution with various degrees of skewness, kurtosis and modality are considered. In particular, a...

Matrix Kummer-Pearson VII Relation and Polynomial Pearson VII Configuration Density

A case of the matrix Kummer relation of Herz (1955) based on the Pearson VII type matrix model is derived in this paper. As a consequence, the polynomial Pearson VII configuration density is obtained and this sets the corresponding exact inference as a solvable aspect in shape theory. An application in postcode recognition, including a numerical comparison between the exact poly...

An Optimal Unsupervised Satellite image Segmentation Approach Based on Pearson System and k-Means Clustering Algorithm Initialization

This paper presents an optimal and unsupervised satellite image segmentation approach based on Pearson system and k-Means Clustering Algorithm Initialization. Such method could be considered as original by the fact that it utilised K-Means clustering algorithm for an optimal initialisation of image class number on one hand and it exploited Pearson system for an optimal statistical distributions...

Journal:
  • Pattern Recognition Letters

Volume 31

Published 2010